Abstract

We put forward a formal model of anonymous systems and concentrate on anonymous failure
detectors in this model. In particular, we give three examples of anonymous failure detectors
and show that

• they can be used to solve the consensus problem; 

• they are equivalent to their classic counterparts. 

Moreover, we show some relationships among them and provide a simple classification of
anonymous failure detectors.



Some proofs are in the appendix.

Email: danielliy@gmail.com. Department of Computer Science and Engineering, The Chinese University of Hong
1 Introduction



1.1 Background 

The consensus problem [CHT96] is now recognized as one of the most important problems
to solve when designing or implementing reliable applications on top of an unreliable
asynchronous distributed system. As it is impossible to implement consensus even with one
faulty process [FLP85], one solution is to turn to the concept of failure detectors.
In [CT96], the concept of unreliable failure detectors is introduced and used to solve
the consensus problem in asynchronous systems. In [DGFG02], the weakest failure detectors
for solving consensus in the message-passing model are proved to be $\Omega$ and $\Diamond\mathcal{W}$ with a majority
of correct processes. In [LH94], for the shared-memory model, $\Omega$ and $\Diamond\mathcal{W}$ are the weakest
failure detectors for solving consensus in any environment. The difference between [DGFG02]
and [LH94] is that [LH94] has a stronger abstraction, i.e. the register, in the process of implementing
the consensus problem.

Further, in [DGFG10] and [DGFG+04], a new kind of failure detector, $\Sigma$, is introduced,
which can be used to implement a register. Consequently, the weakest failure detector for solving
consensus is actually $(\Omega, \Sigma)$. In [DGFG02], realistic failure detectors are considered;
in [MR99], a generic protocol for solving consensus is brought forward; and in [Zie07],
the eventual failure detectors are classified. In particular, [JT07] shows that every problem has
a weakest failure detector.

[BR10] studies failure detectors in an anonymous system, where the processes have
no identity. Nevertheless, it does not provide a mathematical characterization of anonymity,
the central concept of that paper, which leaves the notion of an anonymous system vague. In
this paper, we address the question of anonymity. Specifically, we provide a rigorous model for
anonymous systems and show several results in our model.

1.2 A Formal Model for Anonymous Systems 

We use $\mathbb{N}$, the set of natural numbers $\{0, 1, 2, \ldots\}$, to denote the range of the clock's ticks. $P$
denotes a set of $n$ processes $\{p_1, p_2, \ldots, p_n\}$. $F : \mathbb{N} \to 2^P$ is a failure pattern, i.e. $F(t)$ is the set of
processes that have crashed by time $t$. An environment $\mathcal{E}$ is a set of possible failure patterns. The
processes can only fail by crashing (halting permanently). We assume that at least one process
is correct in our model. Each process is connected to every other process via a reliable channel,
and message delays on these channels are unbounded but finite.

Communication is based on the broadcast primitive (the same as in the classical system)
and an anonymous receive operation (introduced below). We characterize the anonymity
of the system by using a permutation function. Suppose that $\Pi$ is a permutation function
mapping from $P$ to $P$, that is, $\Pi$ is a permutation of all the processes. Let $\mathcal{R}$ be a
possibly infinite range of values that the processes send to each other, $R$ be a function
mapping $P \times \mathbb{N}$ to $\mathcal{R}$, and $R_i(j,t)$ be the value that process $i$ receives from another process $j$
at time $t$. Intuitively, the anonymous receive operation means that the receiver cannot tell who
sent the message. Mathematically, the anonymous receive operation is defined by a function $R^\Pi$:

$$R_i^\Pi(j,t) = R_i(\Pi(j),t).$$

The system above is called an anonymous message-passing system. The anonymous shared- 
memory system is introduced in the appendix. 
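
The permutation-based definition can be illustrated with a small sketch. It is not part of the original model: the names `anonymous_view` and `observable`, and the dictionary encoding of $R_i(\cdot, t)$ at one fixed time, are assumptions made for illustration. The sketch checks that relabelling the senders by any permutation leaves the multiset of received values unchanged, which is the only information an anonymous receiver can use.

```python
from collections import Counter
from itertools import permutations

def anonymous_view(received_from, perm):
    """R_i^Pi(j, t) = R_i(Pi(j), t) for one fixed time t.

    received_from maps a sender index j to the value R_i(j, t);
    perm maps j to Pi(j). Both encodings are illustrative assumptions.
    """
    return {j: received_from[perm[j]] for j in received_from}

def observable(received_from):
    """The multiset of values, i.e. the view with sender labels erased."""
    return Counter(received_from.values())

# Hypothetical snapshot: values that process i received from senders 0, 1, 2.
snapshot = {0: "a", 1: "b", 2: "a"}

# Every relabelling of the senders leaves the multiset of values unchanged.
for p in permutations(range(3)):
    permuted = anonymous_view(snapshot, dict(enumerate(p)))
    assert observable(permuted) == observable(snapshot)
```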

1.3 A Formal Definition of Anonymous Failure Detectors 

Based on our formal model of anonymous systems, we introduce the formal definition of anonymous
failure detectors. We define $crashed(F)$ (faulty processes) to be $\bigcup_{t \in \mathbb{N}} F(t)$ and $correct(F)$
(correct processes) to be $P - crashed(F)$. Moreover, $|crashed(F)|$ denotes the number of faulty
processes and $|correct(F)|$ the number of correct processes.

A failure detector history $H$ with range $\mathcal{R}$ is a function from $P \times \mathbb{N}$ to $\mathcal{R}$. $H(p,t)$ is the
value of the failure detector module of process $p$ at time $t$. A failure detector $\mathcal{D}$ is a function
that maps each failure pattern $F$ to a set of failure detector histories $H_{\mathcal{D}}$ with range $\mathcal{R}_{\mathcal{D}}$ (where
$\mathcal{R}_{\mathcal{D}}$ denotes the range of failure detector outputs of $\mathcal{D}$). $\mathcal{D}(F)$ denotes the set of possible failure
detector histories permitted by $\mathcal{D}$ for the failure pattern $F$. We define $F^\Pi(t) = \Pi(F(t)), \forall t \in \mathbb{N}$
and $H^\Pi(p,t) = H(\Pi(p),t), \forall p \in P, t \in \mathbb{N}$. If $\forall \Pi$, $H^\Pi \in \mathcal{D}(F^\Pi)$, then $\mathcal{D}$ is called an anonymous
failure detector.

1.4 Examples of Anonymous Failure Detectors 

In this part we introduce some examples of failure detectors under our model of anonymous 
systems. 

1.4.1 $\mathcal{N}$

Each failure detector module of $\mathcal{N}$ outputs a natural number in $\{0, 1, 2, \ldots, n\}$, which represents
the number of processes suspected to have crashed. So the range of $\mathcal{N}$ is $\mathcal{R}_{\mathcal{N}} = \{0, 1, 2, \ldots, n\}$.

$\mathcal{N}(F)$ is the set of all failure detector histories $H_{\mathcal{N}}$ with range $\mathcal{R}_{\mathcal{N}}$ that satisfy the following
properties:

• Completeness: Eventually the failure detector outputs a number that is greater than or
equal to the actual number of crashed processes.

$\exists t \in \mathbb{N}, \forall t' \in \mathbb{N}$ and $t' > t, \forall q \in P, H_{\mathcal{N}}(q,t') \geq |crashed(F)|$.

• Accuracy: The (correct) failure detector always outputs a number that is smaller than
or equal to the actual number of crashed processes.

$\forall t \in \mathbb{N}, \forall q \in correct(F), H_{\mathcal{N}}(q,t) \leq |crashed(F)|$.
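
The two properties above can be checked mechanically on a finite prefix of a history. The following is a minimal sketch, not part of the original definition: the function names and the list-of-dicts encoding are assumptions, and "eventually" is approximated by requiring Completeness to hold at the last sampled time, which is weaker than the property over all of $\mathbb{N}$.

```python
def crashed(F):
    """Union of F(t) over the sampled (finite) horizon."""
    return set().union(*F)

def satisfies_N(H, F, processes):
    """Check Accuracy and a finite-horizon stand-in for Completeness.

    H[t][q] is the output at process q at time t;
    F[t] is the set of processes crashed by time t.
    """
    faulty = crashed(F)
    correct = set(processes) - faulty
    k = len(faulty)
    # Accuracy: correct processes never output more than |crashed(F)|.
    accuracy = all(H[t][q] <= k for t in range(len(H)) for q in correct)
    # Assumption: "eventually" is approximated by the last sample only.
    completeness = all(H[-1][q] >= k for q in processes)
    return accuracy and completeness

processes = [0, 1, 2]
F = [set(), {2}, {2}]  # process 2 crashes at time 1
good = [{0: 0, 1: 0, 2: 0}, {0: 0, 1: 1, 2: 1}, {0: 1, 1: 1, 2: 1}]
bad = [{0: 2, 1: 0, 2: 0}, {0: 1, 1: 1, 2: 1}, {0: 1, 1: 1, 2: 1}]  # over-counts at time 0
```

Here `good` would pass the check, while `bad` violates Accuracy because process 0 reports two crashes although only one process ever crashes.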

1.4.2 $\Diamond\mathcal{N}$

Each failure detector module of $\Diamond\mathcal{N}$ outputs a natural number in $\{0, 1, 2, \ldots, n\}$, which represents
the number of processes suspected to have crashed. So the range of $\Diamond\mathcal{N}$ is
$\mathcal{R}_{\Diamond\mathcal{N}} = \{0, 1, 2, \ldots, n\}$.

$\Diamond\mathcal{N}(F)$ is the set of all failure detector histories $H_{\Diamond\mathcal{N}}$ with range $\mathcal{R}_{\Diamond\mathcal{N}}$ that satisfy the
following properties:

• Completeness: Eventually the failure detector outputs a number that is greater than or
equal to the actual number of crashed processes.

$\exists t \in \mathbb{N}, \forall t' \in \mathbb{N}$ and $t' > t, \forall q \in P, H_{\Diamond\mathcal{N}}(q,t') \geq |crashed(F)|$.

• Eventual Accuracy: Eventually, the output of (correct) failure detectors is a number that
is smaller than or equal to the actual number of crashed processes.

$\exists t \in \mathbb{N}, \forall t' \in \mathbb{N}$ and $t' > t, \forall q \in correct(F), H_{\Diamond\mathcal{N}}(q,t') \leq |crashed(F)|$.

1.4.3 $\Theta$

Each failure detector module of $\Theta$ outputs a boolean value, i.e., true or false. So in this
case, the range of $\Theta$ is $\mathcal{R}_{\Theta} = \{true, false\}$.

$\Theta(F)$ is the set of all failure detector histories $H_{\Theta}$ with range $\mathcal{R}_{\Theta}$ that satisfy the following
property:

• Eventual Self-Trust: There is a time after which exactly one correct process trusts
itself.

$\exists t \in \mathbb{N}, \exists p \in correct(F)$, s.t. $\forall t' \in \mathbb{N}$ and $t' > t, H_{\Theta}(p,t') = true, \forall q \in correct(F) - \{p\}, H_{\Theta}(q,t') = false$.
2 Consensus Algorithms

2.1 Consensus Problem 

In this part we briefly review the consensus problem [CT96], which is defined by the following
four properties:

• Termination: Every correct process eventually decides some value. 

• Irrevocability (Integrity): Every process decides at most once. 

• Agreement: No two correct processes decide differently. 

• Validity: If a process decides v, then v was proposed by some process. 

2.2 Consensus Algorithm with $\mathcal{N}$

We assume that the number of processes that may crash is bounded by $f$. Our $\mathcal{N}$-based
consensus algorithm proceeds in $f + 1$ asynchronous rounds. In each round $r$, every process
broadcasts its value and then blocks until it has received enough round-$r$ messages. In this
and the following sections we assume that the message system places all received messages
in a multiset called received. Since we only consider messages from the round a process
is currently in, we implicitly delay messages from later rounds until their round starts. For
memory efficiency, messages from previous rounds can be discarded at every round switch (i.e.,
whenever $r$ is increased).

Remark: Since algorithms typically wait for messages from alive processes, we deemed
it more useful to use the converse of the failure detectors described above. Moreover, since it
is always safe to wait for messages from $n - f$ processes, we consider oracles that output the
number of processes believed to be alive. In the following, we write $N$ and $\Diamond N$ for the
outputs of these converse oracles of $\mathcal{N}$ and $\Diamond\mathcal{N}$, respectively.

Algorithm 1 Consensus on $v$ with $\mathcal{N}$
1: $v \in \{0, 1\}$ initially the input value
2: for $r$ from 1 to $f + 1$ do
3:   broadcast (Propose, v, r)
4:   wait until received contains (Propose, _, r) at least $N$ times
5:   values $\leftarrow \{v\} \cup \{v' : (Propose, v', r) \in received\}$
6:   $v \leftarrow \max v' \in$ values
7: end for
8: decide $v$
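
The round structure of Algorithm 1 can be exercised with a small lockstep simulation. This is a sketch under stated assumptions, not the authors' implementation: rounds are synchronized, the oracle is modeled implicitly by letting every survivor hear from every other survivor, a process crashing in a round delivers its message to a random subset, and the name `run_algorithm1` and the `crash_schedule` encoding are hypothetical. The schedule is assumed to contain at most $f$ crashes.

```python
import random

def run_algorithm1(inputs, f, crash_schedule, seed=0):
    """Lockstep sketch of Algorithm 1 for binary inputs.

    inputs: list of 0/1 proposals, one per process.
    crash_schedule: {round: [indices of processes crashing in that round]},
    assumed to contain at most f crashes in total.
    Returns {process index: decided value} for the surviving processes.
    """
    rng = random.Random(seed)
    v = list(inputs)
    alive = set(range(len(inputs)))
    for r in range(1, f + 2):  # rounds 1 .. f + 1
        crashing = set(crash_schedule.get(r, [])) & alive
        survivors = alive - crashing
        # Every survivor hears from all survivors (including itself).
        received = {p: [v[q] for q in survivors] for p in survivors}
        # A crashing process reaches only a random subset of the survivors.
        for q in crashing:
            for p in survivors:
                if rng.random() < 0.5:
                    received[p].append(v[q])
        # Lines 5-6: adopt the maximum of the own value and all received values.
        for p in survivors:
            v[p] = max([v[p]] + received[p])
        alive = survivors
    return {p: v[p] for p in alive}
```

For example, with `inputs=[0, 1, 0, 0]`, `f=1` and `crash_schedule={1: [1]}`, the only proposer of 1 crashes in round 1 and may reach only some processes, but the crash-free round 2 equalizes the values of the survivors.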



In this part, we show that consensus is solvable among $n \geq f + 1$ processes. To this end we
start out with a lemma that shows that processes never give up on the value 1 once they
have adopted it.

Lemma 2.1 (Stubbornness). If some correct process $p$ adopts $v \leftarrow 1$ in some round $r$, or $p$
initially ($r = 0$) proposes 1, then $p$ will have $v = 1$ for all rounds $r' > r$.

With this intermediate step, showing Validity becomes quite simple: 

Lemma 2.2 (Validity). If a process decides v using Algorithm 1, then v was proposed by some 
process. 

The proof of agreement is patterned around the idea that among the $f + 1$ rounds there
must be one round during which no process crashes. We will show that this is enough for all
processes to reach states such that all preferred values are equal, and that once this is the case,
no process can decide on another value, since no variable or message will ever carry the other
value in later rounds.

Lemma 2.3. If there exists a round, say $r$, in which no process crashes, then all processes
set their $v$ to the same max(values) by the end of that round, and decide on this $v$.

Observing that there are $f + 1$ rounds but at most $f$ processes can crash, it is evident that:

Observation 1. In executions with $f + 1$ rounds, where at most $f$ processes can crash, there is
at least one round in which no process crashes.
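
The counting argument behind Observation 1 can be spelled out as follows. The symbol $c_r$, the number of processes that crash during round $r$, is introduced here for illustration only and does not appear in the original text.

```latex
\[
  \sum_{r=1}^{f+1} c_r \;\le\; f
  \quad\text{and}\quad
  c_r \ge 1 \ \ \forall r \in \{1,\ldots,f+1\}
  \;\Longrightarrow\;
  \sum_{r=1}^{f+1} c_r \;\ge\; f+1 ,
\]
```

which is a contradiction; hence $c_r = 0$ for at least one round $r$.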

Agreement is evident from Lemma 2.3 and Observation 1. Validity was shown in Lemma
2.2. Irrevocability follows trivially from the algorithm, and Termination follows from the fact
that the algorithm can never get stuck in a round, since the number of received messages must
eventually be greater than or equal to the output of $\mathcal{N}$ in every round.

Theorem 2.4. Algorithm 1 allows $n > f$ processes to reach Consensus.

2.3 Consensus Algorithm with $\Diamond\mathcal{N}$

We show that consensus is possible among $n > 2f$ processes when we augment our basic
asynchronous model with $\Diamond\mathcal{N}$.

Algorithm 2 Consensus algorithm with $\Diamond\mathcal{N}$
1: $v \in \{0, 1\}$ initially the input value
2: lock $\leftarrow$ ?, decided $\leftarrow$ false, $r \leftarrow 0$
3: loop
4:   broadcast (Propose, r, v)
5:   wait until received contains (Propose, r, _) at least $\Diamond N$ times
6:   proposed $\leftarrow \{w : (Propose, r, w) \in received\}$
7:   $v \leftarrow \min(proposed)$
8:   if $proposed - \{v\} = \emptyset$ then
9:     lock $\leftarrow v$
10:   else
11:     lock $\leftarrow$ ?
12:   end if
13:   broadcast (Lock, r, lock, v)
14:   if decided then
15:     halt
16:   end if
17:   wait until received contains (Lock, r, _, _) at least $\Diamond N$ times
18:   locked $\leftarrow \{w : (Lock, r, w, \_) \in received\}$
19:   if $locked - \{?\} \neq \emptyset$ then
20:     $v \leftarrow \min(locked - \{?\})$
21:     if $locked - \{v\} = \emptyset$ then
22:       decide $v$
23:       decided $\leftarrow$ true
24:     end if
25:   else
26:     proposed $\leftarrow \{w : (Lock, r, \_, w) \in received\}$
27:     $v \leftarrow \min(proposed)$
28:   end if
29:   $r \leftarrow r + 1$
30: end loop
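
The interplay of the propose phase and the lock phase can be traced in the simplest possible setting. The sketch below is an illustration only: it assumes no crashes and an accurate oracle, so every process sees every message of the round, and the name `run_algorithm2_failure_free` is hypothetical. It does not model asynchrony or inaccurate oracle outputs.

```python
UNLOCKED = "?"

def run_algorithm2_failure_free(inputs, max_rounds=10):
    """Failure-free, full-information sketch of Algorithm 2.

    Returns (decisions, round of decision), or (list of None, None)
    if no decision is reached within max_rounds.
    """
    v = list(inputs)
    n = len(v)
    decisions = [None] * n
    for r in range(max_rounds):
        # Phase 1 (lines 4-12): propose, adopt the minimum, lock only if unanimous.
        proposed = set(v)
        v = [min(proposed)] * n
        lock = [v[p] if proposed == {v[p]} else UNLOCKED for p in range(n)]
        # Phase 2 (lines 13-28): exchange lock messages.
        locked = set(lock)
        real = locked - {UNLOCKED}
        if real:
            x = min(real)
            v = [x] * n
            if locked == {x}:
                decisions = [x] * n
                return decisions, r
        else:
            # Lines 26-27: fall back to the minimum of the values carried by lock messages.
            v = [min(v)] * n
    return decisions, None
```

With mixed inputs, the first round ends with every process holding the minimum but nobody locking; the second round is unanimous and leads to a decision.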
Lemma 2.5. When some process $p$ sends a lock message for some value, say $x \neq ?$, in round
$r$, then no other process can send a lock message for some other value $y \notin \{x, ?\}$.

Lemma 2.6. In Algorithm 2, let round r be the first round where some process decides, say on 
x, then (1) no other process can decide a different value in round r, and (2) all other processes 
will decide x at most one round later. 

What remains to be shown is that there will eventually be a round in which one process is 
able to decide. 

Lemma 2.7. In every execution of Algorithm 2 there eventually is a round $r_d$ where at least
one process decides.

Lemma 2.8. Algorithm 2 guarantees Validity and Integrity. 

Theorem 2.9. Algorithm 2 solves consensus in anonymous asynchronous systems augmented
with $\Diamond\mathcal{N}$ when $n > 2f$.

2.4 Consensus Algorithm with $\Theta$

The difference between $\Omega$ and $\Theta$ is that with $\Theta$ only the eventual leader learns its role
directly from the oracle. The most important difference is that in our algorithm only the leader
sends a (Leader, r, _) message. This (and that $n > 2f$) ensures that processes cannot make
too much progress independently. First, however, we prove that all processes actually do make
progress.

Lemma 2.10. No correct process blocks forever in a round. 

Next we show that each process has to decide, by showing that when one or more process 
decides first then all other processes must decide later on, since deciding is always triggered by 
a (Decide, _) message which processes forward before deciding. Then we prove that there is at 
least one first process to send that message after the leader has stabilized. Obviously the first 
part also holds if the (Decide, _) message is sent before stabilization. 

Lemma 2.11. Every correct process decides. 

Through our final Lemma it will become evident that Agreement must hold: 

Lemma 2.12. It is impossible for (Decide, w) and (Decide, w') with $w \neq w'$ to be sent.

Since processes decide on the value received via a (Decide, v) message, Agreement follows
from the previous lemma, Termination from Lemma 2.11, Integrity from the fact that processes
halt immediately after deciding, and Validity from the fact that every value ever sent can be
traced back to the initial value $v$ of some process. Therefore we have:

Theorem 2.13. Algorithm 3 solves Consensus in asynchronous anonymous systems augmented
with $\Theta$, if $n > 2f$.

3 Equivalence with Classical Failure Detectors 

In this section we investigate the relationship between our anonymous failure detectors and
the classic ones ($\mathcal{P}$, $\Diamond\mathcal{P}$ and $\Omega$) [CHT96, CT96]. To this end we have to assume that unique
identifiers are available and that every reception can be attributed to the sender.

Firstly, we observe that the equivalence between $\Omega$ and $\Theta$ is obvious: to obtain one from
the other it is sufficient for the process that trusts itself to simply tell the other processes, or
for all processes but the leader elected by $\Omega$ to simply ignore the failure detector's output. The
translations of $\mathcal{P}$ to $\mathcal{N}$ and of $\Diamond\mathcal{P}$ to $\Diamond\mathcal{N}$ are obvious as well: in both cases it suffices to output
the number of processes which are not suspected.

Algorithm 3 Consensus algorithm with $\Theta$
1: $v \in \{0, 1\}$ initially the input value
2: $r \leftarrow 0$
3: Code for processes $p$:
4: loop
5:   wait until $\Theta$ = true or received contains (Leader, r, _) at least once
6:   if received (Leader, r, w) then
7:     $v \leftarrow w$
8:   else
9:     if $\Theta$ = true then
10:       broadcast (Leader, r, v)
11:     end if
12:   end if
13:   broadcast (Report, r, v)
14:   wait until received contains (Report, r, _) at least $(n - f)$ times
15:   if $\exists w$ : received (Report, r, w) from $> n/2$ processes then
16:     aux $\leftarrow w$
17:   else
18:     aux $\leftarrow$ ?
19:   end if
20:   broadcast (Vote, r, aux)
21:   wait until received contains (Vote, r, _) at least $(n - f)$ times
22:   if received (Vote, r, aux') with $aux' \neq ?$ then
23:     $v \leftarrow aux'$
24:   end if
25:   if received (Vote, r, aux') with $aux' \neq ?$ at least $n - f$ times then
26:     broadcast (Decide, v)
27:   end if
28:   $r \leftarrow r + 1$
29: end loop
30: upon reception of (Decide, v) do
31:   broadcast (Decide, v)
32:   decide $v$
33:   halt

The remaining relations are explored in more detail via transformations. By $\mathcal{D} \to \mathcal{D}'$ we denote
the asynchronous algorithm that implements $\mathcal{D}'$ based on $\mathcal{D}$. Both transformations work by
building an estimate of the alive processes, denoted by AL, and then suspecting all processes
that are not in this set, i.e., $P - AL$, where $P$ denotes the set of all processes.

Algorithm 4 $\Diamond\mathcal{N} \to \Diamond\mathcal{P}$ Implementation
1: Code for processes $p$:
2: $r \leftarrow 0$
3: suspect $\leftarrow \emptyset$
4: loop
5:   $r \leftarrow r + 1$
6:   broadcast (ALIVE, r)
7:   wait until received $\Diamond N$ (ALIVE, r) messages from the set AL
8:   suspect $\leftarrow P - AL$
9: end loop



The implementation of $\Diamond\mathcal{N} \to \Diamond\mathcal{P}$ (Algorithm 4) is quite simple. Since only Eventual Strong
Accuracy is required, it suffices to suspect those processes that did not send (ALIVE, _) messages
in the current round. Thus wrong suspicions can only occur in rounds where crashed
processes have sent messages before crashing, and these messages arrive faster than those from
alive processes. In some later round, this wrong suspicion is eventually corrected, due to the absence of
a message from the crashed process.
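
The transient nature of wrong suspicions can be seen in a two-round example. The sketch below is illustrative only: the helper `suspects_for_round`, the process names, and the arrival orders are hypothetical, and the oracle output is fixed to 2 in both rounds.

```python
def suspects_for_round(processes, arrival_order, k):
    """Lines 7-8 of Algorithm 4: AL is the set of the first k senders heard in this round."""
    al = set(arrival_order[:k])
    return set(processes) - al

P = {"p1", "p2", "p3"}

# Round 1: p3 sent ALIVE and then crashed; its message overtakes the one from p2.
round1 = suspects_for_round(P, ["p1", "p3", "p2"], 2)  # p2 is wrongly suspected

# Round 2: p3 is silent, so only correct processes can fill AL.
round2 = suspects_for_round(P, ["p2", "p1"], 2)  # only the crashed p3 is suspected
```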

Let us now reiterate the properties of $\mathcal{P}$ and $\Diamond\mathcal{P}$:

• Strong Completeness. Eventually every process that crashes is permanently suspected by
every correct process.

• Strong Accuracy. No process is suspected before it crashes. 

• Eventual Strong Accuracy. There is a time after which correct processes are not suspected 
by any correct process. 

Theorem 3.1. The implementation $\Diamond\mathcal{N} \to \Diamond\mathcal{P}$ (Algorithm 4) guarantees Strong Completeness and
Eventual Strong Accuracy.



Algorithm 5 $\mathcal{N} \to \mathcal{P}$ Implementation
1: Code for processes $p$:
2: $r \leftarrow 0$
3: suspect $\leftarrow \emptyset$; earlieralive $\leftarrow \emptyset$; lastchange $\leftarrow 0$
4: loop
5:   broadcast (ALIVE, r)
6:   wait until received (ALIVE, r) messages from some set AL and $|AL| = N$
7:   if $AL \neq$ earlieralive then
8:     lastchange $\leftarrow r$
9:   else
10:     if $r >$ lastchange $+ f + 2$ then
11:       suspect $\leftarrow P - AL$
12:     end if
13:   end if
14:   earlieralive $\leftarrow AL$
15:   $r \leftarrow r + 1$
16: end loop
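
The stability rule in lines 7 to 13 can be traced on a given sequence of AL sets. The following sketch is an illustration under assumptions: the function name `run_translation` is hypothetical, and the AL set observed in each round is supplied as input rather than derived from message exchange.

```python
def run_translation(al_per_round, processes, f):
    """Trace the suspect set of Algorithm 5 given the AL set observed in each round."""
    suspect, earlier, lastchange = set(), set(), 0
    trace = []
    for r, al in enumerate(al_per_round):
        if al != earlier:
            lastchange = r  # AL changed: restart the stability window
        elif r > lastchange + f + 2:
            suspect = set(processes) - al  # AL has been stable long enough
        earlier = al
        trace.append(set(suspect))
    return trace

# Process 3 is heard in round 0 only; f = 1.
trace = run_translation([{1, 2, 3}] + [{1, 2}] * 6, {1, 2, 3}, 1)
```

In this run AL last changes in round 1, so process 3 is first suspected in round 5, the first round with `r > 1 + f + 2`.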
Since $\mathcal{P}$ is not allowed to make wrong suspicions, we have to make sure that AL always
contains all processes that have not crashed whenever we update suspect. To ensure this we
wait until this set does not change for $f + 2$ rounds. Before we turn to proving that our translation
guarantees Strong Accuracy and Strong Completeness, we show that all alive processes proceed
through the rounds in a somewhat coordinated way:

Lemma 3.2. At any time, the difference between the round numbers of two alive processes $p$
and $q$ is smaller than or equal to $f + 1$.

Lemma 3.3. The translation $\mathcal{N} \to \mathcal{P}$ guarantees Strong Accuracy.

Lemma 3.4. The translation $\mathcal{N} \to \mathcal{P}$ guarantees Strong Completeness.

From the lemmas above it follows immediately that

Theorem 3.5. The translation $\mathcal{N} \to \mathcal{P}$ guarantees Strong Completeness and Strong Accuracy.

4 A Simple Classification and Reductions 

4.1 A Formal Model for Reductions among Anonymous Failure Detectors

We say an anonymous failure detector $\mathcal{D}'$ can be reduced to another failure detector $\mathcal{D}$ ($\mathcal{D}$ is
stronger than $\mathcal{D}'$) if there is an algorithm $T_{\mathcal{D} \to \mathcal{D}'}$ that transforms $\mathcal{D}$ to $\mathcal{D}'$ under an environment
$\mathcal{E}$. $T_{\mathcal{D} \to \mathcal{D}'}$ (using $\mathcal{D}$) maintains a variable $output_p$ at every process $p$, which emulates the output
of $\mathcal{D}'$ at $p$. Let $O$ be the history of all the output variables; we require $O^\Pi \in \mathcal{D}'(F^\Pi)$, where
$F \in \mathcal{E}$ and $\Pi$ is a permutation function of all the processes.

4.2 A Simple Classification 

In short, $\mathcal{N}$ and $\Diamond\mathcal{N}$ belong to the same class of failure detectors, the symmetric failure detectors.
$\Theta$ belongs to another class of failure detectors, the unsymmetrical failure detectors. The difference between
the two kinds is that a symmetric failure detector outputs the same information
at all correct processes while an unsymmetrical failure detector does not.

This is a very simple classification. In an anonymous system, a process cannot distinguish
other processes and only knows itself. So usually an unsymmetrical failure detector outputs one
value at a specific process and some other value at the rest of the processes.

4.3 Reductions among Anonymous Failure Detectors 

We now discuss the relations among the anonymous failure detectors mentioned above.

Theorem 4.1. $\mathcal{N}$ is stronger than $\Diamond\mathcal{N}$.

Proof. They both have the property of completeness. The accuracy property of $\mathcal{N}$ clearly
implies the eventual accuracy property of $\Diamond\mathcal{N}$, while the reverse is not true. □

Theorem 4.2. $\mathcal{N}$ and $\Theta$ are incomparable.

Proof. Obviously, there is no deterministic reduction from $\Theta$ to $\mathcal{N}$, as there is simply no way
to break the symmetry in $\mathcal{N}$. Further, there is no deterministic reduction from $\mathcal{N}$ to $\Theta$. By
contradiction, assume that there exists a reduction algorithm $A$ such that for each failure pattern
$F$ and failure detector history $H \in \Theta(F)$, $A$ outputs a failure detector history $H' \in \mathcal{N}(F)$.
Let $q$ be a correct process. Since $\mathcal{N}$ can be implemented, there should exist $t_0 \in \mathbb{N}$
such that after $t_0$, i.e. for $t > t_0$, $H' \in \mathcal{N}(F)$ satisfies completeness and accuracy, i.e.
$H_{\mathcal{N}}(q,t) = |crashed(F)|$. Without loss of generality, suppose that the only process that
trusts itself in the $\Theta$ output is process $p_1$. We let $p_1$ be silent until $t_1 > t_0$. Then at a time $t_2$
with $t_0 < t_2 < t_1$, the run of the algorithm cannot distinguish the case in which $p_1$ is slow
from the case in which $p_1$ is dead. Therefore, $H_{\mathcal{N}}(q,t_2) > |crashed(F)|$, which violates the property
of accuracy in the definition of $\mathcal{N}$.

□

Theorem 4.3. $\Diamond\mathcal{N}$ and $\Theta$ are incomparable.

Proof. Following an argument similar to the proof of the previous theorem, it is easy to show
that $\Diamond\mathcal{N}$ cannot be reduced to $\Theta$. Also, there is no reduction from $\Theta$ to $\Diamond\mathcal{N}$ due to symmetry
reasons. □

4.4 Randomized Reductions 

The reductions discussed above are deterministic reductions. We also define the notion of
randomized reduction. Instead of requiring $O^\Pi \in \mathcal{D}'(F^\Pi)$, we allow the use of randomness and
only require $O^\Pi \in \mathcal{D}'(F^\Pi)$ to hold with probability at least 2/3. We show an example of
a randomized reduction.

Theorem 4.4. Under randomized reduction, $\mathcal{N}$ is stronger than $\Theta$.

Proof. We will show a reduction algorithm converting $\mathcal{N}$ to $\Theta$. The converse is not possible
due to the proof of Theorem 4.2.

First, each process $p_i$ randomly generates a real number. In theory, there are infinitely
many real numbers, so the probability that two processes draw the same real number
is 0. In practice, however, it may be hard to draw from an infinite set, so
the pool may be finite and there is always some probability that two processes draw the same
number. However, if we let the pool be large enough, much larger than the number of
processes, then the probability that two processes draw the same number is extremely low
and tends to 0 as the pool size goes to infinity.
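
The dependence of the collision probability on the pool size can be computed exactly. The sketch below assumes that each process draws independently and uniformly from a finite pool of size `m`; this sampling model and the name `prob_all_distinct` are assumptions for illustration, as the text does not fix a distribution.

```python
from fractions import Fraction

def prob_all_distinct(n, m):
    """Probability that n independent uniform draws from a pool of size m are pairwise distinct."""
    p = Fraction(1)
    for i in range(n):
        p *= Fraction(m - i, m)  # the (i+1)-th draw must avoid the i values already taken
    return p
```

By the union bound, the collision probability is at most $n(n-1)/(2m)$, so for instance a pool of size 135 already keeps ten processes distinct with probability at least 2/3, the threshold used in the definition of randomized reduction.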

This is where the randomness lies. We can then assume that each process has a distinct real
number. From then on, whenever a process broadcasts, it includes this real number.
One may think that we have returned to the situation of the classic systems. In some sense this
is right, and in another sense it is not. Although $p_1$ receives distinct real numbers, it
cannot tell whether a real number it receives is from $p_2, p_3, \ldots,$ or $p_n$. Thus this is consistent
with the definition of the anonymous system model.

Then why do we say that in some sense we just return to the classic systems? This can be
attributed to the anonymous model. The anonymity is that our real numbers are a permutation
of the process ids, in which sense we cannot distinguish the processes. Nonetheless, if we
treat the real numbers as the identifiers, then the anonymity disappears. We can then
use the CHT proof to extract a certain real number corresponding to a certain
process that we do not know. Each process at this stage can judge whether the extracted real number
is its own initial value. If so, then this process is the process
we seek in the failure detector.

□ 

5 Concluding Remarks 

In summary, we provide a rigorous model for anonymous systems and discuss some issues related 
to failure detectors under our model. We hope that further problems and notions can be brought 
forward in our model. For instance, more examples of failure detectors can be shown. Moreover, 
we believe that the major open problem in the anonymous system is the weakest anonymous 
failure detector for consensus. It may be hard to determine the weakest anonymous failure detector
in a deterministic sense, so we have defined randomized reduction. We hope that the weakest
anonymous failure detector for consensus under randomized reduction is easier to characterize.
A A Formal Model of Anonymous Shared-Memory System



As in the anonymous message-passing system, there is a set of processes $P = \{p_1, p_2, \ldots, p_n\}$ in
the anonymous shared-memory system. In addition, there are $m$ objects $O = \{o_1, o_2, \ldots, o_m\}$.
We assume that each process has a (possibly infinite) state machine and a set of states, one of
which is the initial state. Each state $q$ of process $p$ has three special fields:

• q.obj, the object to be accessed next, or null 

• q.op, the operation on q.obj to be executed 

• q.in, the input parameter (if any) of q.op 

We use a permutation function $\Pi$, which maps $P$ to $P$. The configuration $C$ of the system consists of
the states of all processes and the values of all shared objects, i.e. the vector $(q_1, q_2, \ldots, q_n, v_1, v_2, \ldots, v_m)$.
We define $q_i$ to be the state of $\Pi(p_i)$, for $i = 1, 2, \ldots, n$, and $v_j$ to be the value of $o_j$, for
$j = 1, 2, \ldots, m$. The function $f$ is the state transition function from some state and value
$(q, v)$ to some state $q'$.

Therefore, just as in the anonymous message-passing model, we use a
permutation to characterize the anonymity in the anonymous shared-memory system.

B Proofs for Algorithm 1

B.1 Proof of Lemma 2.1

A process $p$ always adds its current value to values in line 5, and always chooses the maximum
of all values in line 6. Therefore, once 1 was adopted it will always remain the maximum (since
0 and 1 are the only possible values), and the lemma follows.

B.2 Proof of Lemma 2.2

Since we are considering binary consensus only, there are only two cases where Validity could
be violated: (1) either some process $p$ decided 1 when all processes had 0 as their initial value,
or (2) some process $p$ decided 0 and all processes had 1 as their initial value. In case (1) $p$
must have received 1 from some other process at some point, otherwise 1 cannot become a
member of values, and $p$ initially proposes 0 by assumption. Since all processes only send their
current estimate $v$, some process must have initially proposed 1, which is a contradiction to the
assumption of (1). The impossibility of case (2) follows from Lemma 2.1.

B.3 Proof of Lemma 2.3

Let $f_{r-1}$ denote the number of processes that have crashed up to and including round $r - 1$.
When no processes crash during round $r$, the processes will wait for messages from $n - f_{r-1}$
processes, since $\mathcal{N}$ will never output a number smaller than the number of alive processes. This,
however, implies that all processes get the same set of messages in round $r$, and thus they all
have the same set of values in their respective values sets. Therefore the maximum of round $r$,
denoted $m_r$, will be the same at all alive processes. Now agreement follows from the fact that
no process can send a value $v \neq m_r$ in rounds $r' > r$, and therefore $\forall r' > r : m_{r'} = m_r$. Thus
all processes will decide on $m_r$.
C Proofs for Algorithm 2

C.1 Proof of Lemma 2.5

Since $p$ sends (Lock, r, x, x), it cannot have received any propose messages for any other value
(otherwise it could not have reached line 9). Since processes wait for $\Diamond N \geq n - f > n/2$
messages, every other process must have received at least one (Propose, r, x) message, keeping
them from reaching line 9 with $v \neq x$. Thus each process either sends out a lock message for
$x$ or (Lock, r, ?, _).
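
The argument rests on the fact that any two sets of $n - f$ senders overlap when $n > 2f$. The brute-force sketch below checks this for small parameters; it assumes, as Section 3 does for the translations, that senders can be told apart for the purpose of counting, and the name `quorums_intersect` is hypothetical.

```python
from itertools import combinations

def quorums_intersect(n, f):
    """True if every two sets of n - f senders out of n share at least one sender."""
    q = n - f
    procs = range(n)
    return all(
        set(a) & set(b)
        for a in combinations(procs, q)
        for b in combinations(procs, q)
    )
```

The check succeeds for $n = 5, f = 2$ and fails for $n = 4, f = 2$, where two disjoint sets of two senders exist.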

C.2 Proof of Lemma 2.6

Let $p$ denote the deciding process; to be able to decide, $p$ requires $\Diamond N$ (Lock, r, x, _) messages.
From Lemma 2.5 it follows that no process $q$ can have received sufficiently many messages to
decide another value in this round. Therefore, (1) holds.

Since process $p$ has received $\Diamond N \geq n - f \geq f + 1$ lock messages, any other process must have
received at least one of these, and thus all reach line 20 and calculate the same minimum, i.e.,
the only value $x$, and use this value as input for the next round. Since all alive processes now
propose the same value in the proposal phase of round $r + 1$, all processes receive $\Diamond N$ messages
containing $x$. This in turn causes all alive processes to lock $x$ and send enough lock messages
to force all processes that did not decide in $r$ to decide in round $r + 1$ (those that decided
in $r$ terminate after broadcasting their lock messages), thereby ensuring (2).

C.3 Proof of Lemma 2.7

For the sake of contradiction assume otherwise, and let $r_a$ denote the first round where the
output of $\Diamond\mathcal{N}$ is accurate at all alive processes at the start of the round (line 4). Moreover, let $r_c$ denote
the first round after the round in which the last process crashed. Since we assume that no
process ever decides, both of these rounds must exist. Let $r_d = \max\{r_a, r_c\}$; then all processes
will receive all the messages from all alive processes in all rounds $r \geq r_d$. Therefore all must
calculate the same minimum, say $x$, in line 7 from then on. If all values received were the same,
this value is also locked by all processes and therefore decided on in the second phase, leading
to a contradiction, since there is a decision. So assume that there were different values in the
propose messages, and thus $proposed - \{x\} \neq \emptyset$ in line 8, which results in locked containing only ?
in line 19, i.e., no decision is possible in this round. However, all processes set $v \leftarrow x$ in line
27. Now all processes have the same input value at the start of round $r_d + 1$, which leads all
processes to decide $x$ by the argument for (2) in Lemma 2.6.

C.4 Proof of Lemma 2.8

Since no process ever sets $v$ to a value that is neither its own initial value nor a value received
from another process, it follows trivially that when $v$ is decided on, this value must be some
process's input value, thereby ensuring Validity. Finally, Integrity follows from halting in line
15 in the round after deciding.

D Proofs for Algorithm 3 

D.1 Proof of Lemma 2.10

The proof is by contradiction. Let $r$ be the smallest round in which a process blocks forever.
Blocking can only occur at one of the three wait statements. We will show that it is impossible
to wait forever at any of them.

Blocking forever at the first wait requires that in round $r$ no process ever trusts itself to be the leader,
and thus none unblocks itself directly and all other correct processes via a (Leader, r, _) message,
contradicting the properties of $\Theta$.
Since no correct process can block forever at the first wait, all correct processes will eventually
send a (Report, r, _) message, thus unblocking all processes in the second wait (since at most f
processes can crash and thus fail to send a (Report, r, _) message). The same is true of the
third wait statement and (Vote, r, _) messages.

Therefore, no correct process blocks forever in any of the three waits, and thus they
will all start round r + 1, which contradicts the assumption that some process blocks forever in
round r.

D.2 Proof of Lemma 2.11

When one process decides, it has successfully broadcast a (Decide, v) message in the line before. 
This message will eventually arrive at every other process and cause it to decide (if it did not 
decide before). Thus, when one process decides, every correct process decides. 

We now prove that at least one process sends a (Decide, v) message. Due to Lemma 2.10 and
the properties of Ω, eventually there is a round in which only one process sends a (Leader, r, v)
message. Since this message is eventually received by all correct processes, they will all set
their v to the same value w. Thus n − f identical messages will be sent and received in the
second phase, so aux = w at all alive processes, and every alive process will send a
(Decide, v) message (with v = w).

D.3 Proof of Lemma 2.12

We show that when one of the two messages (w.l.o.g. (Decide, w)) is sent in round r, then (1) 
the other cannot be sent in r or in later rounds, and (2) all processes have v = w. 

First we note that in some given round aux cannot take two different non-? values, as only 
one value can reach a majority in a benign system. As the value sent out via (Decide, w) 
messages cannot be ?, it is clear that w = aux. Thus all (Decide, _) messages sent in r must
contain the same value. 
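
The counting argument behind "only one value can reach a majority" can be checked mechanically. The following Python sketch is only an illustration: the name majority_value and the representation of the received values as a list are assumptions made here and are not part of Algorithm 3.

```python
from collections import Counter

def majority_value(votes):
    """Return the value held by a strict majority of `votes`, or None.

    At most one value can qualify: two distinct values that each occur
    more than len(votes)/2 times would need more than len(votes) votes.
    """
    n = len(votes)
    winners = [v for v, c in Counter(votes).items() if 2 * c > n]
    assert len(winners) <= 1  # guaranteed by the counting argument
    return winners[0] if winners else None
```

If no value is held by a strict majority, the function returns None, which plays the role of ? in this sketch.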

Secondly, we observe that v contains the same value at all correct processes at the end
of round r since one process sending out an — / message in line 26 implies that every other 
process will receive at least one (Vote, r, aux') with aux' ≠ ? since n > 2f. From this it follows
that no other value can ever be decided after round r (no other value will ever be proposed by 
a leader). 
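
The step "since n > 2f" is a quorum-intersection argument: any two sets of n − f processes share at least one member. The Python sketch below checks this by brute force for small n. It assumes that each process waits for n − f Vote messages, as suggested by the proof of termination above; the function name is hypothetical.

```python
from itertools import combinations

def quorums_always_intersect(n, f):
    """Check by brute force that any two sets of n - f processes overlap."""
    quorums = [set(c) for c in combinations(range(n), n - f)]
    # An empty intersection is falsy, so all(...) fails on disjoint quorums.
    return all(a & b for a in quorums for b in quorums)
```

For example, with n = 5 and f = 2 every pair of three-process sets overlaps, whereas with n = 4 and f = 2 the sets {0, 1} and {2, 3} are disjoint, so the condition n > 2f cannot be dropped.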

E Proof of Theorem 3.1

Eventually ◇N returns the correct number of processes that have not crashed; let this time be
denoted by t'. Eventually all messages from crashed processes have been delivered at all
processes; let this time be denoted by t''. Further, let t = max(t', t'') and let r_max be
the maximum round at time t. After t the output of ◇N is accurate, so a process
can only succeed in setting its round to some value r > r_max + 1 if it has received
(Alive, r − 1) messages from all alive processes (which are the correct processes). These processes
can be found in AL at each correct process. Therefore, from this time on, the set of suspected
processes is suspect = P − AL = SF, where SF denotes the set of processes that have
crashed. In other words, p will permanently suspect all crashed processes. Thus we have shown
that the ◇N-based algorithm implements Strong Completeness. Moreover, since no correct process is
suspected in any round that starts after t, Eventual Strong Accuracy is guaranteed as well.
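
The set computation suspect = P − AL can be illustrated in a few lines of Python. This is a sketch under stated assumptions: the labels p1, ..., p5 and the function name suspect_set are hypothetical and serve only to make the sets concrete.

```python
def suspect_set(processes, alive):
    """Return the processes missing from the alive estimate AL (P - AL)."""
    return set(processes) - set(alive)

# Hypothetical run: five processes, two of which have crashed.
P = {"p1", "p2", "p3", "p4", "p5"}
crashed = {"p2", "p5"}
AL = P - crashed  # after time t, AL holds exactly the correct processes

suspects = suspect_set(P, AL)
```

In this run suspects equals the crashed set, so every crashed process is suspected (Strong Completeness) and no correct process is (Eventual Strong Accuracy).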






F Proofs for Algorithm 5 



F.1 Proof of Lemma 3.2

Without loss of generality, assume that the round number of q is further advanced than the round
number of p. Assume by contradiction that q has reached round r_p + f + 2 (with r_p
denoting p's round number). Obviously q did not receive a round r' message from p for any
r' > r_p. The only way for q to pass the wait statement in line 6 is for some other process to
send its round r' message and then crash, thus eventually decreasing the output of N at q.
Since q has reached round r_p + f + 2, this must have happened f + 1 times, which contradicts
the definition of f as the maximum number of failures in any execution.
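
The count of f + 1 crashes can be written out explicitly. The block below is a sketch of that counting step only; it assumes, as the proof does, that each round q completes without a message from p requires at least one additional crash.

```latex
% Rounds that q completes without receiving a message from p:
\[
  r_p + 1,\; r_p + 2,\; \dots,\; r_p + f + 1
  \quad\Longrightarrow\quad
  (r_p + f + 1) - (r_p + 1) + 1 = f + 1 \text{ rounds}.
\]
% Each such round requires at least one further crash, hence
\[
  \#\text{crashes} \;\ge\; f + 1 \;>\; f .
\]
```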

F.2 Proof of Lemma 3.3

Assume by contradiction that some process p is put into the suspect set of some process q
before it crashes. This requires that p is not in the alive processes set AL of q for f + 2 rounds.
Moreover, for p to turn up in q's suspect set, the contents of AL cannot have changed for f + 2
rounds. This also implies that N did not change for f + 2 rounds. Due to the accuracy property
of N (it never outputs a number smaller than the number of alive processes), this in turn implies
that no process crashed while q performed the previous f + 2 rounds. Due to Lemma 3.2, no
process that crashed before can have sent messages for all these rounds. Thus, p must be in AL,
contradicting the assumption that it was not.

F.3 Proof of Lemma 3.4

Since a faulty process only takes finitely many steps, it can only send messages for finitely many 
rounds. Therefore, there exists a round from which on no process receives messages from it, 
thus it is not part of the set AL (the alive processes estimate) at any process. This results in 
every crashed process eventually being suspected.